Finite size effects on calorimetric cooperativity of two-state proteins
Finite size effects on the calorimetric cooperativity of the folding-unfolding transition in two-state
proteins are considered using Go lattice models with and without side chains. We show that for
models without side chains a dimensionless measure of calorimetric cooperativity $\kappa_2$, defined as the
ratio of the van't Hoff to the calorimetric enthalpy, does not depend on the number of amino acids $N$.
The average value $\bar{\kappa}_2 \approx 3/4$ is lower than the experimental value $\kappa_2 \approx 1$. For models with side chains
$\kappa_2$ approaches unity as $\kappa_2 \sim N^{\mu}$, where $\mu \approx 0.17$. Above the critical chain length $N_c \approx 135$ these
models can mimic the truly all-or-none folding-unfolding transition.

I. INTRODUCTION

Single domain globular proteins, which are finite sized objects, undergo remarkably cooperative transitions from an
ensemble of unfolded states to well ordered folded (or native) states as the temperature is lowered. In many cases,
the transition to the native state takes place in an apparent two-state manner, i.e. the only detectable species are
the native (more precisely, the ensemble of conformations belonging to the native basin of attraction) or unfolded
states. In order to characterize the two-state folding one can use the dimensionless quantity

$$\kappa_2 = \Delta H_{vh}/\Delta H_{cal} , \tag{1}$$

where $\Delta H_{vh} = 2T_{max}\sqrt{k_B C_p(T_{max})}$ and $\Delta H_{cal} = \int C_p(T)\,dT$ are the van't Hoff and the calorimetric enthalpy,
respectively, and $C_p(T)$ is the specific heat. $\kappa_2$ may be considered as a measure of the calorimetric cooperativity. Since
real globular proteins have $\kappa_2$ very close to unity (chymotrypsin inhibitor 2 is a prime example [5]), it was proposed that
$\kappa_2 \approx 1$ can serve as one of the requirements for realistic models of proteins. There are technical problems in evaluating
$\kappa_2$ using experiments or computations. Inadequate treatment of baseline subtractions in $C_p(T)$ obscures estimates of
$\kappa_2$. As a result it is possible that even sequences with $\kappa_2 \approx 1$ may not clearly be two-state folders. Nevertheless, $\kappa_2$
or related measures have often been used as a measure of calorimetric cooperativity.
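To make the definition in Eq. (1) concrete, the following Python sketch estimates $\kappa_2$ from a sampled specific-heat curve. It is an illustration only: the function names, the trapezoidal integration, the temperature grid and the toy two-level system (one native state at energy `-e0`, `exp(ln_g)` degenerate unfolded states at energy 0, with $k_B = 1$) are assumptions made here and are not taken from the simulations of this paper. No baseline subtraction is applied.

```python
import math

def kappa2(temps, cp, k_b=1.0):
    """Ratio of van't Hoff to calorimetric enthalpy for a sampled C_p(T) curve."""
    # Locate the specific-heat peak (T_max in Eq. 1).
    i_max = max(range(len(cp)), key=lambda i: cp[i])
    dh_vh = 2.0 * temps[i_max] * math.sqrt(k_b * cp[i_max])
    # Trapezoidal estimate of the area under C_p(T) over the sampled range.
    dh_cal = sum(0.5 * (cp[i] + cp[i + 1]) * (temps[i + 1] - temps[i])
                 for i in range(len(cp) - 1))
    return dh_vh / dh_cal

def two_state_cp(t, e0=50.0, ln_g=50.0, k_b=1.0):
    """C_p of a hypothetical two-level system, computed as the energy fluctuation."""
    p_native = 1.0 / (1.0 + math.exp(ln_g - e0 / (k_b * t)))
    return e0 ** 2 * p_native * (1.0 - p_native) / (k_b * t ** 2)

# Assumed temperature grid, wide enough to cover the whole transition.
temps = [0.2 + 2.8 * i / 20000 for i in range(20001)]
cp = [two_state_cp(t) for t in temps]
kappa2_toy = kappa2(temps, cp)
print(round(kappa2_toy, 3))
```

For such an ideal all-or-none system the ratio comes out very close to unity, which is the behavior the calorimetric criterion is meant to detect.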

In a series of works Chan et al. have shown that the calorimetric criterion is difficult to satisfy theoretically.
Even Go models [9], which are more cooperative than others (2-letter, 3-letter and 20-letter models), have $\kappa_2$ notably
smaller than 1. The studies of the Chan group are limited to a few sequences, and it therefore remains unclear whether
Go modeling can meet the calorimetric requirement. One of our goals is to address this problem by carrying out
comprehensive simulations of lattice Go models.
Another dimensionless measure of thermodynamic cooperativity is $\Omega_c$, which is defined in terms of the structural
overlap with the native state $\chi$, the folding temperature $T_F$ and the transition width $\Delta T$. Here $\chi$ can be identified
as the probability of occupation of the native basin of attraction. $\Omega_c$ may be referred to
as the structural cooperativity. Recently, we have shown [12] that it grows with the chain length as $\Omega_c \sim N^{\zeta}$, where
the universal exponent $\zeta \approx 2.22$. This result is supported by experimental data collected for 32 two-state wild type
proteins and by simulations for lattice models. The main goal of this paper is to consider the finite size effects on
$\kappa_2$ of two-state folders with the help of lattice Go models and Monte Carlo simulations. From the definition of $\kappa_2$
it follows that it should be independent of $N$ because both $\Delta H_{vh}$ and $\Delta H_{cal}$ are extensive variables. However, the
approach to the asymptotic behavior is unclear.

We have studied two classes of models: lattice models without side chains (LM) and lattice models with side chains
(LMSC). For the first class, in accord with experiments, $\kappa_2$ was found to be scale-invariant, at least for $N \leq 80$.
However, for the 78 sequences studied the average value is $\bar{\kappa}_2 \approx 3/4$, which is clearly smaller than unity. Thus, in agreement with
previous results, Go LMs do not satisfy the proteinlike cooperativity principle although they are minimally
frustrated.

For Go LMSCs we have found that $\kappa_2$ scales with $N$ as

$$\kappa_2 \sim N^{\mu} \tag{3}$$

before reaching the maximal value 1 at the critical value $N_c \approx 135$. Here the exponent $\mu = 0.17 \pm 0.02$. These results
suggest that $\kappa_2$ becomes scale-invariant for $N \geq N_c$ and that the LMSCs can meet the strict calorimetric cooperativity
criterion only for this range of system sizes. If one assumes that the all-or-none folding takes place at $\kappa_2 \geq 0.9$ then the
critical value $N_c$ is reduced to $N^* = 70$ (see below). In this case the LMSC with $N > N^*$ can capture the calorimetric
behavior of two-state proteins.

FIG. 1: (a) Typical native conformation of an $N = 40$ LMSC sequence. The BB and SC beads occupy sites of the compact 4x4x5
lattice. (b) Dependence of the free energy (measured in $k_BT$), obtained for the sequence whose native conformation is
shown in (a), on the number of native contacts at $T = T_F$. Since the free energy has only one local maximum at the transition
state this sequence is a two-state folder. (c) Temperature dependence of $d\langle\chi\rangle/dT$ (black) and $C_p$ (red, right-hand scale)
for the sequence whose native conformation is shown in (a).



II. MODELS AND METHOD 



In the coarse grained representation of the LM each amino acid is represented as a single bead confined to the vertices
of a cubic lattice [13]. The LMSC is also modeled on a cubic lattice by a backbone (BB) sequence of $N$ beads, to each of which
a "side" bead, representing a side chain (SC), is attached. The peptide bond and the $\alpha$-carbon are given by a single bead
and the system has in total $2N$ beads. Self-avoidance is imposed, i.e. no lattice site can be occupied by more than one
backbone or side bead.

In the LMSC the energy of a conformation is

$$E = \epsilon_{bb}\sum_{i=1,\,j>i+1}^{N}\delta_{r^{bb}_{ij},a} + \epsilon_{bs}\sum_{i=1,\,j\neq i}^{N}\delta_{r^{bs}_{ij},a} + \epsilon_{ss}\sum_{i=1,\,j>i}^{N}\delta_{r^{ss}_{ij},a} , \tag{4}$$

where $\epsilon_{bb}$, $\epsilon_{bs}$ and $\epsilon_{ss}$ are the BB-BB, BB-SC and SC-SC contact energies; $r^{bb}_{ij}$, $r^{bs}_{ij}$ and $r^{ss}_{ij}$ are the distances between the
$i$th and $j$th residues for the BB-BB, BB-SC and SC-SC pairs, respectively; and $a$ is the lattice spacing. The energies $\epsilon_{bb}$, $\epsilon_{bs}$ and
$\epsilon_{ss}$ are chosen to be $-1$ for native contacts and 0 for non-native ones. For the LM the energy in Eq. (4) has only the
BB term.

FIG. 2: The dependence of $\Omega_c$ on $\kappa_2$ for $N = 48$ LMs (solid squares, 18 sequences) and $N = 40$ LMSCs (open hexagons, 15
sequences) (a), for all $N$ LMs (b) and for all $N$ LMSCs (c). For LMs we have studied $N$ = 27 (17), 36 (17), 48 (18), 64 (15) and
80 (11) and for LMSCs $N$ = 18 (30), 24 (18), 32 (20), 40 (15) and 50 (15). Numbers of studied sequences are indicated in
parentheses.
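As an illustration of how the Go energy of Eq. (4) can be evaluated, the Python sketch below counts BB-BB, BB-SC and SC-SC contacts on a cubic lattice with spacing $a = 1$ and assigns $-1$ to native contacts and 0 to non-native ones. The function names, the data layout (lists of integer lattice sites) and the tiny $N = 4$ conformations are hypothetical and chosen only for this example; they are not the sequences studied here.

```python
from itertools import combinations

def contacts(bb, sc, a=1):
    """Set of BB-BB (j > i+1), SC-SC (j > i) and BB-SC (i != j) contacts.

    bb[i] and sc[i] are integer (x, y, z) sites of backbone and side bead i.
    """
    def touching(p, q):
        # Two beads are in contact when their distance equals the lattice spacing.
        return sum((u - v) ** 2 for u, v in zip(p, q)) == a * a

    n = len(bb)
    found = set()
    for i, j in combinations(range(n), 2):
        if j > i + 1 and touching(bb[i], bb[j]):
            found.add(("bb", i, j))
        if touching(sc[i], sc[j]):
            found.add(("ss", i, j))
    for i in range(n):
        for j in range(n):
            if i != j and touching(bb[i], sc[j]):
                found.add(("bs", i, j))
    return found

def go_energy(bb, sc, native_contacts, eps_native=-1.0, eps_nonnative=0.0):
    """Go energy: eps_native per native contact, eps_nonnative per other contact."""
    formed = contacts(bb, sc)
    n_native = len(formed & native_contacts)
    return eps_native * n_native + eps_nonnative * (len(formed) - n_native)

# Hypothetical compact "native" state: backbone on a square, side beads one layer above.
native_bb = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
native_sc = [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
native_set = contacts(native_bb, native_sc)

# Hypothetical extended conformation of the same chain.
ext_bb = [(i, 0, 0) for i in range(4)]
ext_sc = [(i, 0, 1) for i in range(4)]

e_native = go_energy(native_bb, native_sc, native_set)
e_extended = go_energy(ext_bb, ext_sc, native_set)
print(e_native, e_extended)
```

Because non-native contacts carry zero energy, only the overlap between the formed contacts and the native contact set matters, which is what makes the model minimally frustrated.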



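The specific heat in Eq. (1) is defined as the energy fluctuation. For LMSC the overlap function $\chi$ is defined as

$$\chi = \frac{1}{2N^2-3N+1}\left[\sum_{i<j-1}\delta\left(r^{bb}_{ij}-r^{bb,N}_{ij}\right) + \sum_{i<j}\delta\left(r^{ss}_{ij}-r^{ss,N}_{ij}\right) + \sum_{i\neq j}\delta\left(r^{bs}_{ij}-r^{bs,N}_{ij}\right)\right] , \tag{5}$$

where the superscript $N$ refers to the native state and the factor $2N^2-3N+1$ ensures that $\chi = 1$ in the native conformation.
The last equation with only the BB term is applied to the LMs.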
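The normalization of the overlap function can be checked with a short Python sketch. It assumes the same pair ranges as the energy function (BB-BB pairs with $j > i+1$, SC-SC pairs with $j > i$, BB-SC pairs with $i \neq j$), for which the number of pairs is exactly $2N^2-3N+1$. The function names and the $N = 4$ test conformations are hypothetical.

```python
def pair_distances2(bb, sc):
    """Squared distances for all BB-BB, SC-SC and BB-SC pairs entering the overlap."""
    def d2(p, q):
        return sum((u - v) ** 2 for u, v in zip(p, q))

    n = len(bb)
    out = {}
    for i in range(n):
        for j in range(n):
            if j > i + 1:
                out[("bb", i, j)] = d2(bb[i], bb[j])
            if j > i:
                out[("ss", i, j)] = d2(sc[i], sc[j])
            if i != j:
                out[("bs", i, j)] = d2(bb[i], sc[j])
    return out

def overlap(bb, sc, native_bb, native_sc):
    """Fraction of pair distances that coincide with their native values."""
    cur = pair_distances2(bb, sc)
    nat = pair_distances2(native_bb, native_sc)
    n = len(bb)
    norm = 2 * n * n - 3 * n + 1
    assert len(nat) == norm  # the pair count matches the normalization factor
    return sum(cur[k] == nat[k] for k in nat) / norm

# Hypothetical N = 4 conformations used only to exercise the function.
native_bb = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
native_sc = [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
ext_bb = [(i, 0, 0) for i in range(4)]
ext_sc = [(i, 0, 1) for i in range(4)]

chi_native = overlap(native_bb, native_sc, native_bb, native_sc)
chi_extended = overlap(ext_bb, ext_sc, native_bb, native_sc)
print(chi_native, chi_extended)
```

Squared integer distances are compared so that the equality test is exact on the lattice.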

The Monte Carlo simulations were carried out using the move set MS3, which involves single, double and triple
bead moves. Because this move set involves multiparticle updates it is much more efficient than the standard
move set [18]. The thermodynamic properties are calculated using the multiple histogram method [19]. Sequences are
selected as two-state folders if their free energy plotted against the number of native contacts has two well-defined
minima.
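The two-minima selection rule can be sketched in Python as follows. This is a minimal illustration, assuming that the free energy is obtained as $F(Q) = -k_BT \ln P(Q)$ from a histogram of the number of native contacts $Q$, and that "well-defined" means a barrier of at least `min_barrier` (in units of $k_BT$) between the two minima. Both the threshold and the example histograms are assumptions introduced here, not values used in this work.

```python
import math

def free_energy_profile(histogram, k_b_t=1.0):
    """F(Q) = -k_B T ln P(Q) from counts of conformations with Q native contacts."""
    total = sum(histogram)
    return [-k_b_t * math.log(c / total) if c > 0 else math.inf
            for c in histogram]

def local_minima(profile):
    """Indices Q at which F(Q) lies strictly below its neighbours."""
    idx = []
    for q, f in enumerate(profile):
        left = profile[q - 1] if q > 0 else math.inf
        right = profile[q + 1] if q + 1 < len(profile) else math.inf
        if f < left and f < right:
            idx.append(q)
    return idx

def is_two_state(histogram, min_barrier=1.0):
    """True if F(Q) has exactly two minima separated by a barrier >= min_barrier."""
    profile = free_energy_profile(histogram)
    minima = local_minima(profile)
    if len(minima) != 2:
        return False
    barrier_top = max(profile[minima[0]:minima[1] + 1])
    higher_minimum = max(profile[minima[0]], profile[minima[1]])
    return barrier_top - higher_minimum >= min_barrier

# Hypothetical histograms of Q sampled at the folding temperature.
bimodal = [5, 40, 200, 40, 6, 2, 6, 45, 210, 50, 4]
unimodal = [1, 3, 10, 40, 120, 300, 120, 40, 10, 3, 1]
print(is_two_state(bimodal), is_two_state(unimodal))
```

In practice the histogram would come from reweighted simulation data, and statistical noise may call for smoothing before the minima are counted.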



III. RESULTS 



Fig. 1a shows the typical native conformation of the $N = 40$ LMSC sequence. The free energy is calculated as a
function of the number of native contacts, which is treated as an approximate reaction coordinate for Go models, and
the corresponding results obtained at $T = T_F$ are shown in Fig. 1b. Since the free energy profile has only one local
maximum, located at the transition state, this sequence is a two-state folder. Clearly, for Go models the peaks of $C_p$
and $d\langle\chi\rangle/dT$ coincide (Fig. 1c).






FIG. 3: (a) Dependence of $\kappa_2$ on $N$ for LMs (solid squares) and LMSCs (solid hexagons). The sequences are the same as in
Fig. 2. (b) The same as for LMSCs in (a) but the data are shown in a log-log plot. The dotted line refers to $\kappa_2 = 1$. The solid
straight line is the linear fit $y = -0.809 + 0.165x$ (the correlation coefficient is 0.96). It crosses the $\kappa_2 = 1$ line at the critical value
$N_c = 135$.

Fig. 2a shows the structural cooperativity against the calorimetric one for a given value of $N$. As expected, $\Omega_c$ grows
with $\kappa_2$ for both LMs and LMSCs. However, the relation between these quantities becomes non-trivial if we combine
the results for all values of $N$ (Fig. 2b and Fig. 2c). The correlation remains strong for LMSCs but, surprisingly, it
almost vanishes for LMs. It is not clear whether the absence of correlation for the LMs is intrinsic or merely an artifact
of the limited set of data. Clarification of this point requires further investigation. Of all 176 sequences
studied (78 LM sequences and 98 LMSC ones), 10 sequences have $\kappa_2 \geq 0.85$ and only one sequence, which has $\kappa_2 \approx 0.9$,
nearly satisfies the calorimetric cooperativity principle.

Since $\kappa_2$ of the LMs is not sensitive to $N$ we can calculate its average over the whole data set (78 sequences)
and obtain $\bar{\kappa}_2 \approx 3/4$, which is notably smaller than unity. Thus our results, which are in accord with Kaya and Chan,
also suggest that it is hard to meet the calorimetric criterion for Go LMs for any chain length. Using the relation
$\kappa_2 = \sqrt{1 - 4(T_g/T_F)^2}$ derived from the random energy model, where $T_g$ is interpreted as the temperature below
which folding kinetics is dominated by trapping mechanisms, we obtain $T_F/T_g = 8/\sqrt{7} \approx 3$. This value is far below the
proposed $T_F/T_g = 4.6$ required for two-state melting with $\kappa_2 = 0.9$ but higher than, say, $T_F/T_g \approx 1.6$ for three-letter
models.
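The arithmetic behind these ratios can be reproduced with a few lines of Python. The sketch simply inverts the random energy model relation quoted above, $\kappa_2 = \sqrt{1 - 4(T_g/T_F)^2}$, to give $T_F/T_g = 2/\sqrt{1-\kappa_2^2}$; the function names are chosen here for illustration.

```python
import math

def kappa2_from_ratio(tf_over_tg):
    """kappa_2 = sqrt(1 - 4 (T_g/T_F)^2); real-valued only for T_F/T_g >= 2."""
    return math.sqrt(1.0 - 4.0 / tf_over_tg ** 2)

def ratio_from_kappa2(kappa2):
    """Inverse relation: T_F/T_g = 2 / sqrt(1 - kappa_2^2), for 0 <= kappa_2 < 1."""
    return 2.0 / math.sqrt(1.0 - kappa2 ** 2)

ratio_lm = ratio_from_kappa2(0.75)   # average kappa_2 of the Go LMs
ratio_09 = ratio_from_kappa2(0.9)    # threshold used for two-state melting
print(round(ratio_lm, 2), round(ratio_09, 2))  # 3.02 4.59
```

Note that the relation yields a real $\kappa_2$ only for $T_F/T_g \geq 2$, so it cannot be inverted for a ratio as low as 1.6.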

The difference in the scaling behavior of LMs and LMSCs is clearly seen in Fig. 3a, where the size effect is visible
only for sequences with SC. From the log-log plot (Fig. 3b) we obtain the exponent $\mu = 0.17 \pm 0.02$. Extrapolating our
results to $\kappa_2 = 1$ we find the critical length $N_c \approx 135$ above which LMSCs always satisfy the calorimetric cooperativity
requirement. If we assume that the transition is two-state if $\kappa_2 \geq 0.9$ then the calorimetric cooperativity is satisfied
for $N > N^*$, where $N^* \approx 70$.
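The two characteristic chain lengths follow directly from the linear fit quoted in the caption of Fig. 3, $y = -0.809 + 0.165x$. The Python sketch below assumes that $y = \ln\kappa_2$ and $x = \ln N$ with natural logarithms; this is an inference made here from the fact that the fit then crosses $\kappa_2 = 1$ at $N \approx 135$, and the function name is hypothetical.

```python
import math

# Fit parameters quoted in the Fig. 3 caption (natural logarithms assumed).
INTERCEPT, SLOPE = -0.809, 0.165

def chain_length_for(kappa2_target):
    """Solve ln(kappa_2) = INTERCEPT + SLOPE * ln(N) for N."""
    return math.exp((math.log(kappa2_target) - INTERCEPT) / SLOPE)

n_c = chain_length_for(1.0)     # length at which the fit reaches kappa_2 = 1
n_star = chain_length_for(0.9)  # length at which the fit reaches kappa_2 = 0.9
print(round(n_c), round(n_star))
```

With these numbers the fit gives a value close to 135 for the first length and close to 70 for the second, consistent with the estimates in the text. Both values lie beyond the largest LMSC studied ($N = 50$), so they are extrapolations of the fit.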

IV. CONCLUSION 

We have shown that for a given system size the structural cooperativity correlates with the calorimetric one. The
scaling of the calorimetric cooperativity has been examined for lattice two-state Go models of proteins. The LMs
superficially mimic experiments in the sense that $\kappa_2$ is almost insensitive to the system size. However, they are not
able to reproduce the experimental value $\kappa_2 \approx 1$. The rate of success in designing a Go LM that has $\kappa_2 \geq 0.9$ is
rather low (about 1%). The lack of scaling of LM folding cooperativity with chain length prevents these models from
describing the cooperativity of wild-type proteins. This appears to be an inherent deficiency of LMs without side chains.

For the Go LMSCs $\kappa_2$ depends on the system size up to the critical size $N_c$, above which the full requirement of
calorimetric cooperativity is satisfied. Their advantage is that the criterion $\kappa_2 \geq 0.9$ may be satisfied for relatively
small globular proteins ($N > N^* \approx 70$). Our study shows that the incorporation of side chains in protein LMs represents
a crucial modification, which makes LMSCs protein-like.

It should be noted that we have considered only pairwise interactions for Go models, and this may be the reason
why the calorimetric criterion is hard to fulfill even for LMSCs. Multiparticle interactions may be required to
quantitatively describe the cooperativity seen in proteins.






This work was supported by the KBN grant No 1P03B01827 and the National Science Foundation grant (NSF 
CHE-0209340). MSL thanks H. S. Chan for providing Ref. iQ. 



[1] A. V. Finkelstein and O. B. Ptitsyn, Protein physics: A course of lectures, Academic Press (New York, 2002).
[2] M. S. Li and M. Cieplak, J. Phys. A 32, 5577 (1999).
[3] (a) D. Poland and H. A. Scheraga, Theory of helix-coil transitions in biopolymers, Academic Press (New York, 1970); (b) T. E. Creighton, Proteins: Structures and Molecular Principles, W. H. Freeman & Co. (New York, 1993); (c) P. L. Privalov, Adv. Protein Chem. 33, 167 (1979).
[4] H. Kaya and H. S. Chan, Phys. Rev. Lett. 85, 4823 (2000).
[5] S. E. Jackson and A. R. Fersht, Biochemistry 30, 10428 (1991).
[6] H. S. Chan, S. Shimizu, and H. Kaya, Methods in Enzymology 380, 350 (2004).
[7] H. Kaya and H. S. Chan, Proteins Struct. Funct. Genet. 40, 637 (2000).
[8] H. Kaya and H. S. Chan, J. Mol. Biol. 326, 911 (2003).
[9] N. Go, Annu. Rev. Biophys. Bioeng. 12, 183 (1983).
[10] C. J. Camacho and D. Thirumalai, Proc. Natl. Acad. Sci. USA 90, 6369 (1993).
[11] M. S. Li, D. K. Klimov, and D. Thirumalai, Polymer 45, 573 (2004).
[12] M. S. Li, D. K. Klimov, and D. Thirumalai, Phys. Rev. Lett. (in press).
[13] K. A. Dill, S. Bromberg, K. Yue, K. M. Fiebig, D. P. Yee, P. D. Thomas, and H. S. Chan, Protein Science 4, 561 (1995).
[14] D. K. Klimov and D. Thirumalai, Fold. Des. 3, 127 (1998).
[15] M. S. Li, D. K. Klimov, and D. Thirumalai, J. Phys. Chem. B 106, 8302 (2002).
[16] M. R. Betancourt, J. Chem. Phys. 109, 1545 (1998).
[17] M. S. Li, D. K. Klimov, and D. Thirumalai, Comp. Phys. Commun. 147, 625 (2002).
[18] H. J. Hilhorst and J. M. Deutch, J. Chem. Phys. 63, 5153 (1975).
[19] A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett. 63, 1195 (1989).
[20] H. S. Chan, Proteins Struct. Funct. Genet. 40, 543 (2000).
[21] J. N. Onuchic, Z. Luthey-Schulten, and P. G. Wolynes, Annu. Rev. Phys. Chem. 48, 545 (1997).
[22] (a) J. Tsai, M. Gerstein, and M. Levitt, Prot. Sci. 6, 2606 (1997).



